Abstract

One of the crown jewels of complexity theory is Valiant's 1979 theorem that computing the
permanent of an $n \times n$ matrix is #P-hard. Here we show that, by using the model of
linear-optical quantum computing (and in particular, a universality theorem due to Knill,
Laflamme, and Milburn), one can give a different and arguably more intuitive proof of this theorem.

1 Introduction 

Given an $n \times n$ matrix $A = (a_{ij})$, the permanent of $A$ is defined as

\[ \operatorname{Per}(A) = \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i,\sigma(i)}. \]
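As a concrete reading of the definition, the following minimal sketch evaluates the permanent directly as a sum over all $n!$ permutations. The function name `permanent` and the list-of-rows matrix representation are choices made here for illustration; the running time is exponential, so this is only meant for tiny matrices.

```python
from itertools import permutations
from math import prod


def permanent(a):
    """Permanent of a square matrix (list of rows), summed over all n! permutations."""
    n = len(a)
    return sum(prod(a[i][sigma[i]] for i in range(n)) for sigma in permutations(range(n)))


# For a 2 x 2 matrix the permanent is a11*a22 + a12*a21 (no minus sign, unlike the determinant).
print(permanent([[1, 2], [3, 4]]))
```

The only difference from the Leibniz formula for the determinant is the missing sign of the permutation.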

A seminal result of Valiant [15] says that computing $\operatorname{Per}(A)$ is #P-hard, if $A$ is a matrix over (say)
the integers, the nonnegative integers, or the set $\{0,1\}$.^1 Here #P means (informally) the class of
counting problems (problems that involve summing exponentially many nonnegative integers), and
#P-hard means "at least as hard as any #P problem."^2,^3

More concretely, Valiant gave a polynomial-time algorithm that takes as input an instance
$\varphi(x_1, \ldots, x_n)$ of the Boolean satisfiability problem, and that outputs a matrix $A_\varphi$ such that
$\operatorname{Per}(A_\varphi)$ encodes the number of satisfying assignments of $\varphi$. This means that computing the
permanent is at least as hard as counting satisfying assignments.

Unfortunately, the standard proof that the permanent is #P-hard is notoriously opaque; it 
relies on a set of gadgets that seem to exist for "accidental" reasons. Could there be an alternative 
proof that gave more, or at least different, insight? In this paper, we try to answer that question 
by giving a new, quantum-computing-based proof that the permanent is #P-hard. In particular,
we will derive the permanent's #P-hardness as a consequence of the following three facts:

*MIT. Email: aaronson@csail.mit.edu. This material is based upon work supported by the National Science 
Foundation under Grant No. 0844626. Also supported by a DARPA YFA grant and a Sloan Fellowship. 

^1 See Hrubes, Wigderson, and Yehudayoff [7] for a recent, "modular" presentation of Valiant's proof (which also
generalizes the proof to the noncommutative and nonassociative case).

^2 See the Complexity Zoo (www.complexityzoo.com) for the definitions of #P and other complexity classes used
in this paper.

^3 If $A$ is a nonnegative integer matrix, then $\operatorname{Per}(A)$ is itself a #P function, which implies that it is #P-complete
(the term for functions that are both #P-hard and in #P). If $A$ can have negative or fractional entries, then strictly
speaking $\operatorname{Per}(A)$ is no longer #P-complete, but it is still #P-hard and computable in the class $\mathrm{FP}^{\#\mathrm{P}}$.
(1) Postselected linear optics is capable of universal quantum computation, as shown in a
celebrated 2001 paper of Knill, Laflamme, and Milburn [9] (henceforth referred to as KLM).^4

(2) Quantum computations can encode #P-hard quantities in their amplitudes.

(3) Amplitudes in $n$-photon linear-optics circuits can be expressed as the permanents of $n \times n$
matrices.

Even though our proof is based on quantum computing, we stress that we have made it entirely
self-contained: all of the results we need (including the KLM Theorem [9], and even the construction
of the Toffoli gate from 1-qubit and CSIGN gates) are proved in this paper for completeness.
We assume some familiarity with quantum computing notation (e.g., kets and quantum circuit
diagrams), but not with linear optics.

1.1 Motivation

If one counts the complexity of all of the individual pieces we use (especially the universality
results for quantum gates), then our reduction from #P to the permanent ends up being at least
as complicated as Valiant's, and probably more so. In our view, however, this is similar to how
writing a program in C++ tends to produce a longer, more complicated executable file than writing
the same program in assembly language. Normally, one also cares about the length and readability
of the source code! Our purpose in this paper is to illustrate how quantum computing provides a
powerful "high-level programming language" in which one can, among other things, easily rederive
the most celebrated result in the theory of #P-hardness.

But why does the world need a new proof that the permanent is #P-hard, especially a proof
invoking what some might consider to be exotic concepts? Let us offer several answers:

• Any theorem as basic as the #P-hardness of the permanent deserves several independent
proofs. And our proof really is "independent" of the standard one: rather than composing
variable and clause gadgets,^5 we multiply matrices corresponding to quantum gates, and use
ideas from linear optics to keep track of how such multiplications affect the permanent. One
way to see the difference is that our proof never uses the notion of a cycle cover.

• While our proof, like the standard one, requires "gadgets" (one to simulate a Toffoli gate
using CSIGN gates, another to simulate a CSIGN gate using postselected linear optics), the
connection to quantum computing gives those gadgets a natural semantics. In other words,
the gadgets were introduced for "practical" reasons having nothing to do with proving the
permanent #P-hard, and can be motivated independently of that goal. If one already
knows the quantum universality gadgets, then we offer what seems like a major advance in
complexity-theoretic pedagogy: a proof that the permanent is #P-hard that can be
reproduced on-the-spot from memory!

• As Kuperberg [10] pointed out, by their nature, any #P-hardness proofs (including ours) that
are based on "quantum postselection" almost immediately yield hardness of approximation
results as well.

• We expect that the quantum postselection approach used here could lead to #P-hardness
proofs for many other problems, including problems not already known to be #P-hard by
other means. In this direction, one natural place to look would be special cases of the
permanent.

^4 KLM actually prove the stronger (and more practically relevant) result that linear optics with adaptive
measurements is capable of universal quantum computation. For our purposes, however, we only need the weaker fact that
postselected measurements suffice for universal QC, which KLM prove as a lemma along the way to their main result.

^5 Indeed, our proof does not even go through the Cook-Levin Theorem: it reduces a #P computation directly to
the permanent, without first reducing #P to #3SAT.

1.2 Related Work 

By now, there are many examples where quantum computing has been used to give new or
simpler proofs of classical complexity theorems; see Drucker and de Wolf [6] for an excellent survey.
Within the area of counting complexity, Aaronson [1] showed that the class PP is equal to PostBQP
(quantum polynomial-time with postselection), and then used that theorem to give a simpler proof
of the landmark result of Beigel, Reingold, and Spielman [4] that PP is closed under intersection.
Later, also using the PostBQP = PP theorem, Kuperberg [10] gave a "quantum proof" of the result
of Jaeger, Vertigan, and Welsh [8] that computing the Jones polynomial is #P-hard, and even
showed that a certain approximate version is #P-hard (which had not been shown previously).
Kuperberg's argument for the Jones polynomial is conceptually similar to our argument for the
permanent.

There is also precedent for using linear optics as a tool to prove theorems about the permanent.
Scheel [14] observed that the unitarity of linear-optical quantum computing implies the interesting
fact that $|\operatorname{Per}(U)| \le 1$ for all unitary matrices $U$.

Rudolph [13] showed how to encode quantum amplitudes directly as matrix permanents, and
in the process, gave a "quantum-computing proof" that the permanent is #P-hard. However, a
crucial difference is that Rudolph starts with Valiant's proof based on cycle covers, then recasts it
in quantum terms (with the goal of making Valiant's proof more accessible to a physics audience).
By contrast, our proof is independent of Valiant's; the tools we use were invented for separate
reasons in the quantum computing literature.

There has been a great deal of work on linear-optical quantum computing, beyond the seminal
KLM Theorem [9] on which this paper relies. Recently, Aaronson and Arkhipov [2] studied the
complexity of sampling from a linear-optical computer's output distribution, assuming no adaptive
measurements are available. By using the #P-hardness of the permanent as an "input axiom,"
they showed that this sampling problem is classically intractable unless $\mathrm{P}^{\#\mathrm{P}} = \mathrm{BPP}^{\mathrm{NP}}$. More
relevant to this paper is an alternative proof that Aaronson and Arkhipov gave for their result.
Inspired by work of Bremner, Jozsa, and Shepherd [5], the alternative proof combines Aaronson's
PostBQP = PP theorem [1] with the fact that postselected linear optics is universal for PostBQP,
and thereby avoids any direct appeal to the #P-hardness of the permanent. In retrospect, that
proof was already much of the way toward a linear-optical proof that the permanent is #P-hard;
this paper simply makes the connection explicit.

2 Background 

Not by accident, this section constitutes the bulk of the paper. First, in Section 2.1, we fix some
facts and notation about standard (qubit-based) quantum computing. Then, in Section 2.2, we
give a short overview of those aspects of linear-optical quantum computing that are relevant for us,
and (for completeness) prove the KLM Theorem in the specific form we will need.

Figure 1: Simulating a Toffoli gate using CSIGN and 1-qubit gates.

2.1 Quantum Circuits 

Abusing notation, we will often identify a quantum circuit $Q$ with the unitary transformation that
it induces: for example, $\langle 0 \cdots 0 | Q | 0 \cdots 0 \rangle$ represents the amplitude with which $Q$ maps its initial
state to itself. We use $|Q|$ to denote the number of gates in $Q$.

The first ingredient we need for our proof is a convenient set of quantum gates (in the standard
qubit model). Thus, let $\mathcal{G}$ be the set of gates consisting of (1) all 1-qubit gates, and (2) the 2-qubit
controlled-sign gate

\[ \mathrm{CSIGN} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \]

which flips the amplitude if and only if both qubits are $|1\rangle$.^6 Then Barenco et al. [3] showed that
$\mathcal{G}$ is a universal set of quantum gates, in the sense that $\mathcal{G}$ generates any unitary transformation
on any number of qubits (without error). For our purposes, however, the following weaker result
suffices.

Lemma 1 The gate set $\mathcal{G}$ generates the Toffoli gate, the 3-qubit gate that maps each basis state $|x,y,z\rangle$ to
$|x, y, z \oplus xy\rangle$.

Proof. The circuit can be found in Nielsen and Chuang [11] for example, but we reproduce it in
Figure 1 for completeness. In the diagram,

\[ H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \]

is the Hadamard gate,

\[ \frac{1}{2} \begin{pmatrix} \sqrt{2+\sqrt{2}} & i\sqrt{2-\sqrt{2}} \\ i\sqrt{2-\sqrt{2}} & \sqrt{2+\sqrt{2}} \end{pmatrix} \]

is another 1-qubit gate, and the six vertical bars represent CSIGN gates.



^6 A more common 2-qubit gate than CSIGN is the controlled-NOT (CNOT) gate, which maps each basis state
$|x,y\rangle$ to $|x, y \oplus x\rangle$. However, CSIGN is more convenient for linear-optics purposes, and is equivalent to CNOT by
conjugating the second qubit with a Hadamard gate.
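The equivalence between CSIGN and CNOT mentioned in the footnote can be checked numerically. The sketch below is an illustration only: it assumes the basis order $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ with the first qubit as control, and the helper names `matmul` and `kron` are introduced here rather than taken from the paper.

```python
from math import isclose, sqrt


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


s = 1 / sqrt(2)
IDENTITY = [[1, 0], [0, 1]]
HADAMARD = [[s, s], [s, -s]]
# Basis order |00>, |01>, |10>, |11>; the first qubit is the control.
CSIGN = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

# Conjugate CSIGN by a Hadamard acting on the second qubit only.
H_ON_SECOND = kron(IDENTITY, HADAMARD)
conjugated = matmul(H_ON_SECOND, matmul(CSIGN, H_ON_SECOND))
matches = all(isclose(conjugated[i][j], CNOT[i][j], abs_tol=1e-12)
              for i in range(4) for j in range(4))
print(matches)
```

The check works because the Hadamard conjugates the sign flip $Z$ into the bit flip $X$ on the block where the control qubit is $|1\rangle$.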
2.2 Linear-Optical Quantum Computing

We now give a brief overview of linear-optical quantum computing (LOQC), an alternative quantum
computing model based on identical photons rather than qubits. For a detailed introduction to
LOQC from a computer science perspective, see Aaronson and Arkhipov [2].

In LOQC, each basis state of our quantum computer has the form $|S\rangle = |s_1, \ldots, s_m\rangle$, where
$s_1, \ldots, s_m$ are nonnegative integers summing to $n$. Here $s_i$ represents the number of photons in the
$i$-th location or "mode," and the fact that $s_1 + \cdots + s_m = n$ means that photons are never created
or destroyed. One should think of $m$ and $n$ as both polynomially bounded. For this paper, it
will be convenient to assume that $m$ is even, that $n = m/2$, and that the initial state has the form
$|I\rangle = |0, 1, 0, 1, \ldots, 0, 1\rangle$: that is, one photon in each even-numbered mode, and no photons in the
odd-numbered modes.

Let $\Phi_{m,n}$ be the set of nonnegative integer tuples $S = (s_1, \ldots, s_m)$ such that $s_1 + \cdots + s_m = n$,
and let $\mathcal{H}_{m,n}$ be the Hilbert space spanned by basis states $|S\rangle$ with $S \in \Phi_{m,n}$. Then a general
state in LOQC is just a unit vector in $\mathcal{H}_{m,n}$:

\[ |\psi\rangle = \sum_{S \in \Phi_{m,n}} \alpha_S |S\rangle, \]

with $\sum_{S \in \Phi_{m,n}} |\alpha_S|^2 = 1$.

To transform $|\psi\rangle$, one can select any $m \times m$ unitary transformation $U = (u_{ij})$. This $U$ then
induces a larger unitary transformation $\varphi(U)$ on the Hilbert space $\mathcal{H}_{m,n}$ of $n$-photon states. There
are several ways to define $\varphi(U)$, but perhaps the simplest is the following formula:

\[ \langle S | \varphi(U) | T \rangle = \frac{\operatorname{Per}(U_{S,T})}{\sqrt{s_1! \cdots s_m! \, t_1! \cdots t_m!}} \qquad (*) \]

for all tuples $S = (s_1, \ldots, s_m)$ and $T = (t_1, \ldots, t_m)$ in $\Phi_{m,n}$. Here $U_{S,T}$ is the $n \times n$ matrix obtained
from $U$ by taking $s_i$ copies of the $i$-th row of $U$ and $t_j$ copies of the $j$-th column, for all $i, j \in [m]$.
To illustrate, if 

■ 1 
-1 



U 



and \S) = \T) = |2, 1), then 



Us,: 




Intuitively, the reason the permanent arises in formula (*) is that there are $n!$ ways of mapping
the $n$ photons in basis state $|S\rangle$ onto the $n$ photons in basis state $|T\rangle$. Since the photons are
identical bosons, quantum mechanics says that each of those $n!$ ways contributes a term to the total
$\langle S | \varphi(U) | T \rangle$, with the contribution given by the product of the transition amplitudes $u_{ij}$ for each
of the $n$ photons individually.
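To make formula (*) concrete, here is a small sketch that builds $U_{S,T}$ by repeating rows and columns and then divides the permanent by the factorial normalization. The names `expand` and `amplitude`, and the 50/50 beamsplitter used as the example matrix, are choices made for this illustration and do not come from the paper.

```python
from itertools import permutations
from math import factorial, prod, sqrt


def permanent(a):
    """Permanent of a square matrix (list of rows); the empty matrix has permanent 1."""
    n = len(a)
    return sum(prod(a[i][p[i]] for i in range(n)) for p in permutations(range(n)))


def expand(u, s, t):
    """U_{S,T}: take s_i copies of row i and t_j copies of column j of u."""
    rows = [i for i, count in enumerate(s) for _ in range(count)]
    cols = [j for j, count in enumerate(t) for _ in range(count)]
    return [[u[i][j] for j in cols] for i in rows]


def amplitude(u, s, t):
    """<S|phi(U)|T> according to formula (*); s and t are tuples of occupation numbers."""
    norm = sqrt(prod(factorial(x) for x in s) * prod(factorial(x) for x in t))
    return permanent(expand(u, s, t)) / norm


# Hypothetical example: a 50/50 beamsplitter acting on two modes.
r = 1 / sqrt(2)
BEAMSPLITTER = [[r, r], [r, -r]]
print(expand([[1, 2], [3, 4]], (2, 1), (2, 1)))
print(amplitude(BEAMSPLITTER, (1, 1), (1, 1)))
```

With this beamsplitter, the amplitude for one photon in each mode to stay that way is the permanent of the matrix itself, which vanishes up to rounding, while the amplitude between $|2,0\rangle$ and $|1,1\rangle$ has magnitude $1/\sqrt{2}$.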

It turns out that $\varphi(U)$ is always unitary and that $\varphi$ is a homomorphism. Both facts seem
surprising viewed purely as algebraic consequences of formula (*), but of course they have natural
physical interpretations: $\varphi(U)$ is unitary because it represents an actual physical transformation
that can be applied, and $\varphi$ is a homomorphism because generalizing from one photon to $n$ photons
must commute with composing beamsplitters. In this paper, we will not need that $\varphi(U)$ is unitary;
see Aaronson and Arkhipov [2] for a proof of that fact. Below we prove that $\varphi$ is a homomorphism.
Lemma 2 $\varphi$ is a homomorphism.

Proof. We want to show that for all tuples $S, T \in \Phi_{m,n}$ and all $m \times m$ unitaries $U, V$,

\[ \langle S | \varphi(VU) | T \rangle = \langle S | \varphi(V) \varphi(U) | T \rangle = \sum_{R \in \Phi_{m,n}} \langle S | \varphi(V) | R \rangle \langle R | \varphi(U) | T \rangle. \]

By equation (*), the above is equivalent (after multiplying both sides by $\sqrt{s_1! \cdots s_m! \, t_1! \cdots t_m!}$) to
the identity

\[ \operatorname{Per}\big((VU)_{S,T}\big) = \sum_{R \in \Phi_{m,n}} \frac{\operatorname{Per}(V_{S,R}) \operatorname{Per}(U_{R,T})}{r_1! \cdots r_m!}. \qquad (**) \]

We will prove identity (**) in the special case $n = m$ and $S = T = I = (1, 1, \ldots, 1)$, since the
general case is analogous. We have

\begin{align*}
\operatorname{Per}(VU) &= \sum_{\sigma \in S_n} \prod_{i=1}^{n} \Big( \sum_{k=1}^{n} v_{ik} u_{k,\sigma(i)} \Big) \\
&= \sum_{R \in \Phi_{n,n}} \frac{\operatorname{Per}(V_{I,R}) \operatorname{Per}(U_{R,I})}{r_1! \cdots r_n!}.
\end{align*}

In the second line above, we decomposed the sum by thinking about each permutation $\sigma \in S_n$ as
a product of two permutations: one, $\tau$, that maps $n$ particles in the initial configuration $|I\rangle$ to
$n$ particles in the intermediate configuration $|R\rangle$ when $U$ is applied, and a second that maps $n$
particles in the intermediate configuration $|R\rangle$ to $n$ particles in the final configuration $|I\rangle$ when $V$ is
applied. This yields the same result, as long as we remember to sum over all possible intermediate
configurations $R \in \Phi_{n,n}$, and also to divide each summand by $r_1! \cdots r_n!$, which is the size of $R$'s
automorphism group (i.e., the number of ways to permute the $n$ particles within $|R\rangle$ that leave $|R\rangle$
unchanged). ■
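Lemma 2 can also be sanity-checked numerically on small instances. The following sketch builds the matrix of $\varphi(U)$ in the basis $\Phi_{m,n}$ from the permanent formula and compares $\varphi(VU)$ with $\varphi(V)\varphi(U)$. The function names, the choice of two modes and two photons, and the rotation angles are all arbitrary choices made here; a numerical check is of course no substitute for the proof.

```python
from itertools import permutations
from math import cos, factorial, prod, sin, sqrt


def permanent(a):
    n = len(a)
    return sum(prod(a[i][p[i]] for i in range(n)) for p in permutations(range(n)))


def amplitude(u, s, t):
    """<S|phi(U)|T> via the permanent formula; s and t are tuples of occupation numbers."""
    rows = [i for i, count in enumerate(s) for _ in range(count)]
    cols = [j for j, count in enumerate(t) for _ in range(count)]
    sub = [[u[i][j] for j in cols] for i in rows]
    return permanent(sub) / sqrt(prod(map(factorial, s + t)))


def states(m, n):
    """All tuples of m nonnegative integers summing to n (the set Phi_{m,n})."""
    if m == 1:
        return [(n,)]
    return [(k,) + rest for k in range(n + 1) for rest in states(m - 1, n - k)]


def phi(u, n):
    """Matrix of phi(U) on n-photon states, in the basis returned by states()."""
    basis = states(len(u), n)
    return [[amplitude(u, s, t) for t in basis] for s in basis]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def rotation(theta):
    """A 2 x 2 real unitary, used here as a stand-in for a beamsplitter."""
    return [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]]


U, V = rotation(0.3), rotation(1.1)
lhs = phi(matmul(V, U), 2)
rhs = matmul(phi(V, 2), phi(U, 2))
max_gap = max(abs(x - y) for row_l, row_r in zip(lhs, rhs) for x, y in zip(row_l, row_r))
print(max_gap < 1e-12)
```

Since identity (**) is purely algebraic, the same comparison also goes through for non-unitary matrices.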

In the standard qubit model, every unitary transformation can be decomposed as a product of
gates, each of which acts nontrivially on only 1 or 2 qubits. Similarly, in LOQC, every unitary
transformation can be decomposed as a product of linear-optics gates, each of which acts nontrivially
on only 1 or 2 modes. Then a linear-optics circuit is simply a list of linear-optics gates applied to
specified modes (or pairs of modes) starting from the initial state $|I\rangle = |0, 1, \ldots, 0, 1\rangle$.^7

The last notion we need is that of postselected LOQC. In our context, postselection simply
means measuring the number of photons in a given mode $i$, and conditioning on a particular result
(for example, 0 photons, or 1 photon). After we postselect on the number of photons in some
mode, we will never use that mode for further computation.^8 For this reason, without loss of
generality, we can defer all postselected measurements until the end of the computation.

^7 A crucial difference between standard quantum circuits and linear-optics circuits is that, whereas a standard
quantum gate is the tensor product of a small (say $4 \times 4$) unitary matrix with an exponentially large (say $2^{n-2} \times 2^{n-2}$)
identity matrix, a linear-optics gate is the direct sum of a small (say $2 \times 2$) unitary matrix with a polynomially large
(say $(m-2) \times (m-2)$) identity matrix. It is only the homomorphism $U \mapsto \varphi(U)$ that produces exponentially large
matrices. One consequence, pointed out by Reck et al. [12], is that, whereas most $n$-qubit unitary transformations
require $\Omega(2^{2n})$ gates to implement (as follows from an easy dimension argument), every $m$-mode unitary transformation
$U$ can be implemented using only $O(m^2)$ linear-optics gates.

Figure 2: Simulating CSIGN by $\mathrm{NS}_1$ and Hadamard.

Our #P-hardness proof will fall out as a corollary of the following universality theorem, which
is implicit in the work of KLM [9]. Indeed, we could just appeal to the KLM construction as a
"black box," but we choose not to do so, since the properties of the construction that we want are
slightly different from the properties KLM want, and we wish to verify in detail that the desired
properties hold.

Theorem 3 (following KLM [9]) Postselected linear optics can simulate universal quantum
computation. More concretely: there exists a polynomial-time classical algorithm that converts a
quantum circuit $Q$ over the gate set $\mathcal{G}$ into a linear-optics circuit $L$, so that

\[ \langle I | \varphi(L) | I \rangle = \frac{\langle 0 \cdots 0 | Q | 0 \cdots 0 \rangle}{4^{\Gamma}}, \]

where $\Gamma$ is the number of CSIGN gates in $Q$ and $|I\rangle = |0, 1, \ldots, 0, 1\rangle$ is the standard initial state.

Proof. To encode a (qubit-based) quantum circuit by a postselected linear-optics circuit, KLM use
the so-called dual-rail representation of a qubit using two optical modes. In this representation,
the qubit $|0\rangle$ is represented as $|0, 1\rangle$, while the qubit $|1\rangle$ is represented as $|1, 0\rangle$. Thus, to simulate
a quantum circuit that acts on $k$ qubits, we need $2k$ optical modes. (We will also need additional
modes to handle postselection, but we can ignore those for now.) Let the modes corresponding to
qubit $i$ be labeled $(i, 0)$ and $(i, 1)$ respectively. Notice that the initial state $|0 \cdots 0\rangle$ in the qubit
model maps onto the initial state $|I\rangle$ in the optical model.

^8 In physics language, all photon-number measurements are assumed to be "demolition" measurements.

Since $\varphi$ is a homomorphism by Lemma 2, to prove the theorem it suffices to show how to
simulate the gates in $\mathcal{G}$. Simulating a 1-qubit gate is easy: simply apply the appropriate $2 \times 2$
unitary transformation to the Hilbert space spanned by $|0, 1\rangle$ and $|1, 0\rangle$. The interesting part is
how to simulate a CSIGN gate. To do so, KLM use another gate that they call $\mathrm{NS}_1$, which applies
the following unitary transformation to a single mode:

\[ \mathrm{NS}_1 : \alpha_0 |0\rangle + \alpha_1 |1\rangle + \alpha_2 |2\rangle \to \alpha_0 |0\rangle + \alpha_1 |1\rangle - \alpha_2 |2\rangle. \]

(We do not care how $\mathrm{NS}_1$ acts on $|3\rangle$, $|4\rangle$, and so on, since those basis states will never arise in our
simulation.) Using $\mathrm{NS}_1$, it is not hard to simulate CSIGN on two qubits $i$ and $j$. The procedure,
shown in Figure 2, is this: first apply a Hadamard transformation to modes $(i, 0)$ and $(j, 0)$. One
can check that this induces the following transformation on the state of $(i, 0)$ and $(j, 0)$:

\begin{align*}
|0,0\rangle &\to |0,0\rangle, \\
|1,0\rangle &\to \frac{|1,0\rangle + |0,1\rangle}{\sqrt{2}}, \\
|0,1\rangle &\to \frac{|1,0\rangle - |0,1\rangle}{\sqrt{2}}, \\
|1,1\rangle &\to \frac{|2,0\rangle - |0,2\rangle}{\sqrt{2}}.
\end{align*}



The key point is that we get a state involving 2 photons in the same mode, if and only if the
modes $(i, 0)$ and $(j, 0)$ both contained a photon. Next, apply $\mathrm{NS}_1$ gates to both $(i, 0)$ and $(j, 0)$.
This flips the amplitude if and only if we started with $|1, 1\rangle$. Finally, apply a second Hadamard
transformation to $(i, 0)$ and $(j, 0)$, to complete the implementation of CSIGN.
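The three-step recipe (Hadamard, $\mathrm{NS}_1$ on both modes, Hadamard) can be traced on the four relevant two-mode states. The sketch below is an illustration under assumptions made here: states are stored as dictionaries from occupation tuples to amplitudes, the induced action of the Hadamard is computed from the permanent formula (*), and $\mathrm{NS}_1$ is applied as an ideal sign flip rather than through postselection.

```python
from itertools import permutations
from math import factorial, prod, sqrt


def permanent(a):
    n = len(a)
    return sum(prod(a[i][p[i]] for i in range(n)) for p in permutations(range(n)))


def amplitude(u, s, t):
    """<S|phi(U)|T> via the permanent formula; s and t are tuples of occupation numbers."""
    rows = [i for i, count in enumerate(s) for _ in range(count)]
    cols = [j for j, count in enumerate(t) for _ in range(count)]
    sub = [[u[i][j] for j in cols] for i in rows]
    return permanent(sub) / sqrt(prod(map(factorial, s + t)))


r = 1 / sqrt(2)
HADAMARD = [[r, r], [r, -r]]


def apply_phi(u, state):
    """Apply phi(u) to a two-mode state {occupation tuple: amplitude} of fixed photon number."""
    n = sum(next(iter(state)))
    basis = [(k, n - k) for k in range(n + 1)]
    return {s: sum(amplitude(u, s, t) * a for t, a in state.items()) for s in basis}


def ns1_both(state):
    """Ideal NS_1 on each mode: one sign flip for every mode holding exactly 2 photons."""
    return {s: a * (-1) ** sum(1 for c in s if c == 2) for s, a in state.items()}


def csign_on_modes(s):
    """Hadamard, then NS_1 on both modes, then Hadamard, starting from basis state s."""
    return apply_phi(HADAMARD, ns1_both(apply_phi(HADAMARD, {s: 1.0})))


for start in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    out = csign_on_modes(start)
    print(start, {s: round(a, 6) for s, a in out.items()})
```

Only the input $(1,1)$ passes through the two-photon states $|2,0\rangle$ and $|0,2\rangle$, which is why it alone picks up the minus sign.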

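We now explain how to implement $\mathrm{NS}_1$ on a given mode $i$, using postselection. To do so, we
need two additional modes $j$ and $k$, which are initialized to the states $|0\rangle$ and $|1\rangle$ respectively. First
we apply the following $3 \times 3$ unitary transformation to $i, j, k$:

\[ W := \begin{pmatrix}
1 - \sqrt{2} & \sqrt{\frac{3}{\sqrt{2}} - 2} & \frac{1}{2^{1/4}} \\
\sqrt{\frac{3}{\sqrt{2}} - 2} & \sqrt{2} - \frac{1}{2} & \frac{1}{2} - \frac{1}{\sqrt{2}} \\
\frac{1}{2^{1/4}} & \frac{1}{2} - \frac{1}{\sqrt{2}} & \frac{1}{2}
\end{pmatrix}. \]

Then we postselect on $j$ and $k$ being returned to the state $|0, 1\rangle$. As shown in [9], this postselection
always succeeds with amplitude $1/2$ (corresponding to probability $1/4$), and conditioned on it
succeeding, the effect is to apply $\mathrm{NS}_1$ in mode $i$. To prove this, observe that since the number of
photons is conserved, the effect of $W$ on mode $i$ must have the form

\[ \alpha_0 |0\rangle + \alpha_1 |1\rangle + \alpha_2 |2\rangle \to \lambda_0 \alpha_0 |0\rangle + \lambda_1 \alpha_1 |1\rangle + \lambda_2 \alpha_2 |2\rangle, \]

for some $\lambda_0, \lambda_1, \lambda_2$. Using formula (*), we then calculate

\begin{align*}
\lambda_0 &= w_{33} = \frac{1}{2}, \\
\lambda_1 &= \operatorname{Per} \begin{pmatrix} w_{11} & w_{13} \\ w_{31} & w_{33} \end{pmatrix} = \frac{1}{2}, \\
\lambda_2 &= \frac{1}{2} \operatorname{Per} \begin{pmatrix} w_{11} & w_{11} & w_{13} \\ w_{11} & w_{11} & w_{13} \\ w_{31} & w_{31} & w_{33} \end{pmatrix} = -\frac{1}{2}.
\end{align*}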
This implies that the CSIGN circuit shown in Figure 2 succeeds with amplitude $1/4$ (corresponding
to probability $1/16$), and furthermore, we know when it succeeds. ■
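The three amplitudes $\lambda_0, \lambda_1, \lambda_2$ can be recomputed from the permanent formula. The sketch below assumes the mode ordering $(i, j, k)$ with the ancilla modes $j, k$ starting and ending in $|0\rangle, |1\rangle$, and it uses one arrangement of the entries of $W$ that is consistent with the index pattern $w_{11}, w_{13}, w_{31}, w_{33}$ in the calculation above; that arrangement is an assumption of this example, not a quotation of KLM's matrix.

```python
from itertools import permutations
from math import factorial, prod, sqrt


def permanent(a):
    n = len(a)
    return sum(prod(a[i][p[i]] for i in range(n)) for p in permutations(range(n)))


def amplitude(u, s, t):
    """<S|phi(U)|T> via the permanent formula; s and t are tuples of occupation numbers."""
    rows = [i for i, count in enumerate(s) for _ in range(count)]
    cols = [j for j, count in enumerate(t) for _ in range(count)]
    sub = [[u[i][j] for j in cols] for i in rows]
    return permanent(sub) / sqrt(prod(map(factorial, s + t)))


# Assumed arrangement of W (modes ordered i, j, k); see the lead-in above.
root = sqrt(3 / sqrt(2) - 2)
quarter = 2 ** -0.25
W = [
    [1 - sqrt(2), root, quarter],
    [root, sqrt(2) - 0.5, 0.5 - 1 / sqrt(2)],
    [quarter, 0.5 - 1 / sqrt(2), 0.5],
]

# lambda_c is the amplitude for |c, 0, 1> to return to |c, 0, 1>, for c = 0, 1, 2 photons in mode i.
lambdas = [amplitude(W, (c, 0, 1), (c, 0, 1)) for c in range(3)]
print([round(x, 6) for x in lambdas])
```

Under this arrangement the values agree with $1/2, 1/2, -1/2$, so after dividing out the common factor $1/2$ the postselected map acts as $\mathrm{NS}_1$ on mode $i$.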

In the proof of Theorem 3, the main reason the matrix $W$ looks complicated is simply that it
needs to be unitary. However, notice that unitarity is irrelevant for our #P-hardness application,
and if we drop the unitarity requirement, then we can replace $W$ by a simpler $2 \times 2$ matrix, such
as

\[ Y := \begin{pmatrix} 1 - \sqrt{2} & \sqrt{2} \\ 1 & 1 \end{pmatrix}. \]
To implement $\mathrm{NS}_1$ on a given mode $i$, we would apply $Y$ to $i$ as well as another mode $j$ that initially
contains one photon, then postselect on $j$ still containing one photon after $Y$ is applied. One can
verify by calculation that the effect on mode $i$ is

\[ \alpha_0 |0\rangle + \alpha_1 |1\rangle + \alpha_2 |2\rangle \to \lambda_0 \alpha_0 |0\rangle + \lambda_1 \alpha_1 |1\rangle + \lambda_2 \alpha_2 |2\rangle \]

where $\lambda_0 = \lambda_1 = 1$ and $\lambda_2 = -1$.
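The calculation for the non-unitary gadget is short enough to spell out. The sketch below assumes the matrix $Y$ with rows $(1 - \sqrt{2}, \sqrt{2})$ and $(1, 1)$ and the mode ordering $(i, j)$; the closed forms in the comments follow from expanding the small permanents by hand, and the variable names are introduced here for illustration.

```python
from math import sqrt

# Assumed entries of Y, with modes ordered (i, j); mode j holds the ancilla photon.
y11, y12 = 1 - sqrt(2), sqrt(2)
y21, y22 = 1.0, 1.0

# 0 photons in mode i: the ancilla photon must simply stay in mode j.
lambda0 = y22
# 1 photon in mode i: the permanent of the full 2 x 2 matrix.
lambda1 = y11 * y22 + y12 * y21
# 2 photons in mode i: (1/2) * Per of rows (1, 1, 2) and columns (1, 1, 2) of Y,
# which expands to y11^2 * y22 + 2 * y11 * y12 * y21.
lambda2 = y11 ** 2 * y22 + 2 * y11 * y12 * y21

print(round(lambda0, 6), round(lambda1, 6), round(lambda2, 6))
```

Solving $\lambda_0 = \lambda_1 = 1$ and $\lambda_2 = -1$ for the entries forces $y_{22} = 1$, $y_{11} = 1 \pm \sqrt{2}$ and $y_{12} y_{21} = 1 - y_{11}$, so other choices with the same product $y_{12} y_{21}$ would work equally well.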



3 Main Result 

In this section we deduce the following theorem, as a straightforward consequence of Theorem 3.

Theorem 4 The problem of computing $\operatorname{Per}(A)$, given a matrix $A \in \mathbb{Z}^{N \times N}$ of $\mathrm{poly}(N)$-bit integers
written in binary, is #P-hard under many-one reductions.

In classical complexity theory, one is often more interested in various corollaries of Theorem 4:
for example, that computing $\operatorname{Per}(A)$ remains #P-hard even if $A$ is a nonnegative integer matrix,
or a $\{-1, 0, 1\}$-valued matrix, or a $\{0, 1\}$-valued matrix. Valiant [15] gave simple reductions by
which one can deduce all of these corollaries from Theorem 4. We do not know how to use the
linear-optics perspective to get any additional insight into the corollaries.

Let $C$ be a classical circuit that computes a Boolean function $C : \{0, 1\}^n \to \{-1, 1\}$, and let
$\Delta_C := \sum_{x \in \{0,1\}^n} C(x)$. Then computing $\Delta_C$, given $C$ as input, is a #P-hard problem essentially
by definition. On the other hand, it is easy to encode $\Delta_C$ as an amplitude in a quantum circuit:

Lemma 5 There exists a classical algorithm that takes a circuit $C$ as input, runs in $\mathrm{poly}(n, |C|)$
time, and outputs a (qubit-based) quantum circuit $Q$, consisting of gates from the gate set $\mathcal{G}$, such that

\[ \langle 0 \cdots 0 | Q | 0 \cdots 0 \rangle = \frac{\Delta_C}{2^n}. \]

Proof. Let $D_C$ be a $2^n \times 2^n$ diagonal unitary matrix whose $(x, x)$ entry is $C(x)$. Then since the
Toffoli gate is universal for classical computation, a quantum circuit consisting of 1-qubit gates and
Toffoli gates can easily apply $D_C$. To do so, one uses the standard "uncomputing" trick:

\[ |x\rangle \to |x\rangle |h_C(x)\rangle \to C(x) |x\rangle |h_C(x)\rangle \to C(x) |x\rangle, \]

where $h_C(x)$ is the complete history of a computation using Toffoli gates that produces $C(x)$.
Now let $F = H^{\otimes n}$ be the quantum Fourier transform over $\mathbb{Z}_2^n$ (i.e., the Hadamard gate applied to
each of $n$ qubits), and let $Q = F D_C F$. Then

\[ \langle 0 \cdots 0 | Q | 0 \cdots 0 \rangle = \Big( \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} \langle x | \Big) D_C \Big( \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} |x\rangle \Big) = \frac{\Delta_C}{2^n}. \]

Finally, by Lemma 1, we can simulate each of the Toffoli gates in $Q$ using gates from the set $\mathcal{G}$. ■
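The amplitude identity in this proof is easy to confirm by brute force for small $n$. The sketch below simulates $F D_C F$ on a full state vector; it applies $D_C$ directly as a diagonal matrix instead of compiling it into Toffoli gates, and the function names and the example function `c` are assumptions made for illustration.

```python
from itertools import product
from math import sqrt


def hadamard_all(vec):
    """Apply a Hadamard gate to every qubit of a length-2^n state vector."""
    vec = list(vec)
    half = 1
    while half < len(vec):
        for start in range(0, len(vec), 2 * half):
            for i in range(start, start + half):
                a, b = vec[i], vec[i + half]
                vec[i], vec[i + half] = (a + b) / sqrt(2), (a - b) / sqrt(2)
        half *= 2
    return vec


def amplitude_zero_to_zero(c, n):
    """<0...0| F D_C F |0...0> for a function c mapping n-bit tuples to +1 or -1."""
    state = hadamard_all([1.0] + [0.0] * (2 ** n - 1))
    inputs = list(product((0, 1), repeat=n))
    state = [c(x) * amp for x, amp in zip(inputs, state)]  # the diagonal matrix D_C
    return hadamard_all(state)[0]


def delta(c, n):
    """Delta_C: the sum of c(x) over all n-bit inputs."""
    return sum(c(x) for x in product((0, 1), repeat=n))


def c(x):
    """Hypothetical example: -1 exactly when the first two bits are both 1."""
    return -1 if x[0] and x[1] else 1


n = 3
print(amplitude_zero_to_zero(c, n), delta(c, n) / 2 ** n)
```

The first layer of Hadamards creates the uniform superposition, $D_C$ writes the values $C(x)$ into the signs, and the second layer sums them back into the amplitude of $|0 \cdots 0\rangle$.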

Let $Q$ be the quantum circuit from Lemma 5, and assume $Q$ uses $k = \mathrm{poly}(n, |C|)$ qubits. By
Theorem 3, we can simulate $Q$ by a linear-optics circuit $L$ such that

\[ \langle I | \varphi(L) | I \rangle = \frac{\langle 0 \cdots 0 | Q | 0 \cdots 0 \rangle}{4^{\Gamma}}, \]

where $\Gamma = \mathrm{poly}(n, |C|)$ is the number of CSIGN gates in $Q$. Furthermore, the circuit $L$ uses
$m := 2k + 4\Gamma$ optical modes. Let $U$ be the $m \times m$ unitary matrix induced by $L$, and let $V$ be
the $(m/2) \times (m/2)$ submatrix of $U$ obtained by taking the even-numbered rows and columns only.
Then we have

\begin{align*}
\operatorname{Per}(V) &= \langle I | \varphi(L) | I \rangle \\
&= \frac{\langle 0 \cdots 0 | Q | 0 \cdots 0 \rangle}{4^{\Gamma}} \\
&= \frac{\Delta_C}{2^n 4^{\Gamma}},
\end{align*}

where the first line follows from formula (*) and the third from Lemma 5. Since $V$ can be produced
in polynomial time given $C$, this already shows that computing $\operatorname{Per}(V)$ to sufficient precision is
#P-hard.
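The first line of the chain above uses only the fact that, for the standard initial state, $U_{I,I}$ is the submatrix of even-numbered rows and columns and the factorial normalization equals 1. A minimal sketch of that bookkeeping follows; the $4 \times 4$ integer matrix is an arbitrary stand-in chosen here (it is not unitary), and the helper names are hypothetical.

```python
from itertools import permutations
from math import prod


def permanent(a):
    n = len(a)
    return sum(prod(a[i][p[i]] for i in range(n)) for p in permutations(range(n)))


def expand(u, s, t):
    """U_{S,T}: take s_i copies of row i and t_j copies of column j of u."""
    rows = [i for i, count in enumerate(s) for _ in range(count)]
    cols = [j for j, count in enumerate(t) for _ in range(count)]
    return [[u[i][j] for j in cols] for i in rows]


def even_submatrix(u):
    """Rows and columns 2, 4, 6, ... of u (1-based), i.e. 0-based indices 1, 3, 5, ..."""
    idx = range(1, len(u), 2)
    return [[u[i][j] for j in idx] for i in idx]


# Arbitrary 4 x 4 stand-in for U, used only to show which entries are selected.
u = [[4 * i + j for j in range(4)] for i in range(4)]
standard = (0, 1, 0, 1)  # occupation numbers of |I> = |0, 1, 0, 1>

print(even_submatrix(u))
print(expand(u, standard, standard) == even_submatrix(u))
print(permanent(even_submatrix(u)))
```

Because every occupation number in $|I\rangle$ is 0 or 1, all the factorials in formula (*) equal 1, so the amplitude is exactly $\operatorname{Per}(V)$.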

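However, we still need to deal with the issue that the entries of $V$ are real numbers.^9 Let
$b := \lceil \log_2(n!) + 2n + 2\Gamma \rceil$. Then notice that truncating the entries of $V$ to $b$ bits of precision
produces a matrix $\tilde{V}$ such that

\[ \big| \operatorname{Per}(\tilde{V}) - \operatorname{Per}(V) \big| \le n! \Big( 1 - \Big( 1 - \frac{1}{2^b} \Big)^{n} \Big) \le \frac{n! \cdot n}{2^b} < \frac{1}{2^{n+1} 4^{\Gamma}} \]

for sufficiently large $n$, and hence

\[ \big\lfloor 2^n 4^{\Gamma} \operatorname{Per}(\tilde{V}) \big\rceil = 2^n 4^{\Gamma} \operatorname{Per}(V) = \Delta_C. \]

For this reason, we can assume that each entry of $V$ has the form $k/2^b$ for some integer $k \in [-2^b, 2^b]$.
Now set $A := 2^b V$. Then $A$ is an integer matrix satisfying $\operatorname{Per}(A) = 2^{bn} \operatorname{Per}(V)$, whose entries
can be specified using $b + O(1) = \mathrm{poly}(n, |C|)$ bits each. This completes the proof of Theorem 4.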
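The passage from a real matrix to an integer matrix can be illustrated with exact rational arithmetic. In the sketch below, the rounding rule (toward minus infinity), the $2 \times 2$ example matrix, and the precision $b = 10$ are all assumptions made for illustration; the point is only that scaling an $n \times n$ matrix by $2^b$ scales its permanent by $2^{bn}$.

```python
from fractions import Fraction
from itertools import permutations
from math import floor, prod


def permanent(a):
    n = len(a)
    return sum(prod(a[i][p[i]] for i in range(n)) for p in permutations(range(n)))


def truncate(v, b):
    """Round every entry down to a multiple of 2^-b (one possible truncation rule)."""
    return [[Fraction(floor(x * 2 ** b), 2 ** b) for x in row] for row in v]


def to_integer_matrix(v_truncated, b):
    """A := 2^b * V, which has integer entries once V has b bits of precision."""
    return [[int(x * 2 ** b) for x in row] for row in v_truncated]


# Hypothetical 2 x 2 real matrix standing in for V.
V = [[0.6, -0.8], [0.8, 0.6]]
b, n = 10, 2

V_trunc = truncate(V, b)
A = to_integer_matrix(V_trunc, b)

print(A)
print(permanent(A) == 2 ** (b * n) * permanent(V_trunc))
print(float(permanent(V_trunc)), permanent(V))
```

Every product of $n$ entries picks up exactly $n$ factors of $2^b$, which is where the exponent $bn$ comes from.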

We conclude by noticing that our proof yields not only Theorem 4 but also the following
corollary:

Corollary 6 The problem of computing $\operatorname{sgn}(\operatorname{Per}(A)) := \operatorname{Per}(A) / |\operatorname{Per}(A)|$, given a matrix
$A \in \mathbb{Z}^{N \times N}$ of $\mathrm{poly}(N)$-bit integers written in binary, is #P-hard under Turing reductions.

Proof. By the above equivalences, it suffices to show that computing $\operatorname{sgn}(\Delta_C)$ is #P-hard. This
is true because, given the ability to compute $\operatorname{sgn}(\Delta_C)$, we can determine $\Delta_C$ exactly using binary
search. In more detail, given a positive integer $k$, let $C[k]$ denote the circuit $C$ modified to contain
$k$ additional inputs $x$ such that $C(x) = 1$, and let $C[-k]$ denote $C$ modified to contain $k$ additional
$x$'s such that $C(x) = -1$. Then clearly

\[ \Delta_{C[k]} = \Delta_C + k, \qquad \Delta_{C[-k]} = \Delta_C - k. \]

Thus we can use the following strategy: compute the signs of $\Delta_{C[1]}, \Delta_{C[-1]}, \Delta_{C[2]}, \Delta_{C[-2]}, \Delta_{C[4]}, \Delta_{C[-4]}$, and
so on, increasing $k$ by successive factors of 2, until a $k$ is found such that $\operatorname{sgn}(\Delta_{C[k]}) \ne \operatorname{sgn}(\Delta_{C[2k]})$.
At that point, we know that $\Delta_C$ must be between $k$ and $2k$ in absolute value. Then by computing $\operatorname{sgn}(\Delta_{C[3k/2]})$,
we can decide whether $\Delta_C$ is between $k$ and $3k/2$ or between $3k/2$ and $2k$, and so on recursively
until $\Delta_C$ has been determined exactly. ■

^9 Indeed, the matrices that we multiply to obtain $U$ can be complex matrices, but $U$ itself (and hence the submatrix
$V$) will always be real.
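The binary search in this proof can be sketched as follows. The oracle `sign_shifted(k)` is a hypothetical interface assumed here: it returns the sign of $\Delta_C + k$ as $-1$, $0$ or $+1$, where a positive $k$ plays the role of $C[k]$ and a negative $k$ the role of $C[-|k|]$. The bookkeeping differs slightly from the text: the sketch first brackets the point where the sign turns positive by doubling, then bisects.

```python
def recover_delta(sign_shifted):
    """Recover an integer Delta using only the signs of Delta + k for integer shifts k."""

    def positive(k):
        return sign_shifted(k) > 0

    # Bracket the smallest shift that makes Delta + k positive: positive(lo) is False
    # and positive(hi) is True, with the gap found by doubling the step.
    step = 1
    if positive(0):
        hi = 0
        while positive(hi - step):
            step *= 2
        lo = hi - step
    else:
        lo = 0
        while not positive(lo + step):
            step *= 2
        hi = lo + step

    # Bisect until hi is exactly the smallest shift with Delta + hi > 0.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if positive(mid):
            hi = mid
        else:
            lo = mid
    return 1 - hi  # Delta + hi = 1


def make_oracle(secret):
    """Stand-in for the sign oracle of a circuit whose Delta_C equals `secret`."""
    return lambda k: (secret + k > 0) - (secret + k < 0)


print(recover_delta(make_oracle(5)), recover_delta(make_oracle(-3)))
```

The number of oracle calls grows only logarithmically in $|\Delta_C|$, which is what makes this a polynomial-time Turing reduction.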

Corollary 6 implies, in particular, that approximating $\operatorname{Per}(A)$ to within any multiplicative factor
is #P-hard, since to output a multiplicative approximation, at the least we would need to know
whether $\operatorname{Per}(A)$ is positive or negative.

Using a more involved binary search strategy (which we omit), one can show that, for any
$\beta(N) \in [1, \mathrm{poly}(N)]$, even approximating $|\Delta_C|$ or $\Delta_C^2$ to within a multiplicative factor of $\beta(N)$
would let one compute $\Delta_C$ exactly, and is therefore #P-hard under Turing reductions. It follows
from this that approximating $|\operatorname{Per}(A)|$ or $\operatorname{Per}(A)^2$ to within a multiplicative factor of $\beta(N)$ is
#P-hard as well. (Aaronson and Arkhipov [2] gave a related but more complicated proof of the
#P-hardness of approximating $|\operatorname{Per}(A)|$ and $\operatorname{Per}(A)^2$, which did not first replace $\operatorname{Per}(A)$ with $\Delta_C$.)